Abstract: In this paper, we propose vocalization of heart rate
variability (HRV) as a perceptual analysis tool. We adapted a
phonation-production model to encode external signals and generate
audible representations of them. HRV changes caused by induced
perturbations of the autonomic nervous system could be perceived in
the vocalized HRV.
I. INTRODUCTION

To convey information, numerical data should be appropriately mapped
to our sensory world. As a prominent example, computer visualization,
which exploits our complex visual system, has successfully helped to
'make sense' of vast amounts of numerical data. In the same vein,
computer vocalization could engage another subtle physiological tool,
namely the auditory system.

The perception of perturbations in sustained phonation has provided
diagnostic information related to voice disorders. For this purpose,
various perceptual voice quality scales, such as the GRBAS scale [1]
(with grades 'hoarseness', 'rough', 'breathy', 'asthenia', and
'strained'), and other voice characteristics, such as 'rough',
'creak', 'fry', 'falsetto', and combinations thereof, are used.

The perceptual characteristics of the phonation are found to be
related to the amplitude and pitch of the perturbations and to the
power of laryngeal noise [2]. Depending on the characteristics of the
signal perturbation, pitch perturbations can produce 'natural', 'fry',
'creak', and pitch-varying phonations. If the perturbation consists of
multiple frequency components, the sound will be perceived as
'polyphonic.' On the other hand, amplitude perturbation is perceived
as 'loudness' shimmer, whilst 'laryngeal' noise is perceived as
'breathiness' or 'hoarseness' [3]. Changes in the shape of the
'glottal' pulses produce sounds with different degrees of
'naturalness' and 'quality'. Modification of the vocal tract
parameters produces different phonemes or phoneme-like sounds [3].

The HRV is similar to the perturbations of phonation: both signals are
perturbations of physiological rhythms and carry useful clinical
information. There is one perceptual difference, however: the HRV is
not audible.

Here we propose vocalization as a process of coding digital signals
into the parameter space of a voice-synthesis model. The model is
called the vocalization model (VM), and the audible sound it generates
is called the vocalized signal (VS). The coding scheme needs to
consider both the perceptual characteristics of the auditory system
and the characteristics of the data to be vocalized.

Signal vocalization could add a new perceptual dimension to biomedical
(or other) data. However, some problems need to be addressed: Which VM
would be more appropriate for a given class of signals? Which type of
coding should be chosen? Which VM could assist pattern recognition?
Which coding could provide better perceptual information?

Fig. 1. Vocalization model

This paper presents the VM and addresses the last of these questions
through an application to HRV analysis, an important noninvasive tool
that sheds light on the autonomic nervous system (ANS) control of
heart rate [4] and provides valuable clinical information [5].

We let the HRV signal modulate the pitch period of the VS in the VM.
After some practice in listening to the vocalized HRV, we could
differentiate among the various states of HRV induced by
parasympathetic and sympathetic blockade, respectively, in healthy
subjects.

II.  MATERIAL  AND  METHOD 
A.  Vocalization  Model 

Figure 1 shows the block diagram of the vocalization model. In
contrast to speech-synthesis models, which usually modify the vocal
tract and switch between voiced and unvoiced sources [6], [7], the VM
was basically a sustained-phonation synthesis model, which primarily
modified the voiced source and could also provide an additive unvoiced
source.

The voiced source was modelled by a generator of Kronecker delta
pulses followed by Rosenberg's glottal pulse [8] shaper (Fig. 1). The
vocal tract was modelled by an autoregressive model of order twelve
and the lip radiation by a high-pass filter,
$H(z^{-1}) = 1 - 0.99z^{-1}$. Laryngeal noise, or the unvoiced source,
was modelled by a white noise generator. The coding unit coded one or
more input signals into time-varying parameters of the other blocks.
The VM depicted in Fig. 1 has various degrees of freedom. The
amplitude (i) and the period (ii) of the impulses (voiced source
block) can be modulated either individually or in combination. Glottal
pulse shapes (iii) can be varied parametrically. The unvoiced source
could be scaled (iv) and correlated (v) by filters whose parameters
could vary smoothly. The formants (vi) (vocal tract block) could also
be modified parametrically or otherwise.

Fig. 2. Short-term HRV signal (a) and its power spectral density (b),
plotted against frequency (eq. Hz), of a healthy female subject in
rest position.

Thus,  variations  of  pitch,  loudness,  and  ’laryngeal’  noise, 
and  possibly  ’glottal’  shaping  and  ’vocal  tract’  changes, 
could  produce  a  rich  variety  of  sounds  for  the  trained  ear. 
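
The signal chain described above can be sketched in a few lines of
Python. This is only an illustrative sketch under stated assumptions,
not the implementation used in the study: the function names, the
pulse lengths (40 and 16 samples), and the point where the noise is
added are hypothetical, the glottal pulse uses one common textbook
form of the Rosenberg pulse, and a single two-pole resonance stands in
for the order-twelve autoregressive vocal tract, whose coefficients
are not given in the text.

```python
import math
import random

FS = 10_000  # sampling rate in Hz (the phonation in the text is sampled at 10 kHz)

def rosenberg_pulse(n_open, n_close):
    """One glottal pulse: raised-cosine opening, quarter-cosine closing (assumed variant)."""
    rise = [0.5 * (1.0 - math.cos(math.pi * n / n_open)) for n in range(n_open)]
    fall = [math.cos(math.pi * n / (2.0 * n_close)) for n in range(n_close)]
    return rise + fall

def impulse_train(periods, amplitudes=None):
    """Kronecker delta pulses; periods[i] samples separate pulse i from pulse i+1."""
    amplitudes = amplitudes or [1.0] * len(periods)
    out = []
    for period, amp in zip(periods, amplitudes):
        out.append(amp)
        out.extend([0.0] * (period - 1))
    return out

def fir(x, b):
    """Moving-average filter: y[n] = sum_k b[k] * x[n-k]."""
    return [sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
            for n in range(len(x))]

def all_pole(x, a):
    """Autoregressive filter: y[n] = x[n] - sum_k a[k] * y[n-k-1]."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, ak in enumerate(a):
            if n - k - 1 >= 0:
                acc -= ak * y[n - k - 1]
        y.append(acc)
    return y

def resonator(freq_hz, bandwidth_hz):
    """AR coefficients of one two-pole resonance (placeholder for the order-12 model)."""
    r = math.exp(-math.pi * bandwidth_hz / FS)
    return [-2.0 * r * math.cos(2.0 * math.pi * freq_hz / FS), r * r]

def vocalize(periods, noise_gain=0.0, seed=0):
    """Impulses -> glottal shaper -> (+ white noise) -> vocal tract -> lip radiation."""
    rng = random.Random(seed)
    source = fir(impulse_train(periods), rosenberg_pulse(40, 16))
    # Assumption: the unvoiced source is added before the vocal tract filter.
    source = [s + noise_gain * rng.gauss(0.0, 1.0) for s in source]
    tract = all_pole(source, resonator(700.0, 100.0))
    return fir(tract, [1.0, -0.99])  # lip radiation, 1 - 0.99 z^-1

# 20 pulses with a constant 100-sample period (a 100 Hz pitch at 10 kHz).
signal = vocalize([100] * 20)
```

Passing a time-varying list of periods or amplitudes to
`impulse_train`, or a nonzero `noise_gain`, exercises degrees of
freedom (i), (ii), and (iv) listed above.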

B.  HRV  Signals 

The HRV data used here were obtained from a group of normal subjects
who participated in a study on the influence of the ANS on HRV. The
HRV signals, about six to seven minutes long, were obtained from
subjects in supine and tilt positions, with and without induced
parasympathetic and sympathetic autonomic blockade, as described in
[9].

A typical short-term HRV signal, obtained from a female subject in
rest position, is shown in Fig. 2(a); the power spectral density (PSD)
of the detrended signal is shown in Fig. 2(b). The PSD typically
consists of three main frequency components [10], [11] and extends to
at most a few hertz, significantly below the lowest frequency of the
audible range. Fig. 3 shows the HRV signal and its PSD from the same
subject in rest position with induced parasympathetic blockade.
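
As an illustration of the detrending and PSD step, the sketch below
removes a least-squares line from a series and computes a plain
periodogram. It is a minimal sketch under assumptions: the text does
not say which detrending or spectral estimator was used, the series is
assumed to be uniformly resampled at a hypothetical rate `fs`, and the
test series is synthetic (a slow drift plus a 0.25 Hz oscillation),
not measured HRV.

```python
import math

def detrend_linear(x):
    """Subtract the least-squares straight line from x."""
    n = len(x)
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (xi - x_mean) for t, xi in enumerate(x))
    slope = sxy / sxx
    return [xi - (x_mean + slope * (t - t_mean)) for t, xi in enumerate(x)]

def periodogram(x, fs):
    """Return (frequency in Hz, power) pairs for DFT bins 1 .. N/2."""
    n = len(x)
    out = []
    for k in range(1, n // 2 + 1):
        re = sum(xi * math.cos(2.0 * math.pi * k * t / n) for t, xi in enumerate(x))
        im = -sum(xi * math.sin(2.0 * math.pi * k * t / n) for t, xi in enumerate(x))
        out.append((k * fs / n, (re * re + im * im) / (n * fs)))
    return out

fs = 4.0  # assumed resampling rate in Hz (hypothetical)
# Synthetic series: baseline + slow linear drift + 0.25 Hz oscillation.
series = [0.9 + 0.001 * t + 0.05 * math.sin(2.0 * math.pi * 0.25 * t / fs)
          for t in range(64)]
psd = periodogram(detrend_linear(series), fs)
peak_freq = max(psd, key=lambda fp: fp[1])[0]
```

In this synthetic case the dominant component lies well below 20 Hz,
which is the point made above: such a signal has to be mapped onto an
audible carrier before it can be heard.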

C.  Voiced  Source  Coding 

We confined the discussion to voiced source coding. To construct the
VM, we first obtained the mean pitch period of a phonation /a/
(sampled at 10 kHz) from a male subject and estimated the vocal tract
parameters as described previously. The voiced source was given this
mean pitch period.

Fig. 3. Short-term HRV signal (a) and its PSD (b) of the same subject
with induced parasympathetic blockade.

Fig. 4. A segment of vocalized HRV

Second, the period of the voiced source was modulated by the HRV
signal. The period variation of the voiced source was related to the
HRV signal by

$$\Delta v = c \, \frac{\Delta p}{p_m} \, v_m \qquad (1)$$

where $p_m$ and $v_m$ are respectively the sample means of the HRV
signal and the pitch period of the voiced source, whereas
$\Delta p = p - p_m$ and $\Delta v = v - v_m$ denote respectively the
deviations from the corresponding means. The constant $c$ is a
subjective scale factor. Passing the source signal through the
'glottal' pulse shaper and the vocal tract generates the VS. A segment
of a vocalized HRV signal is shown in Fig. 4.
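
The coding rule of Eq. (1) can be sketched as follows. This is a
hedged illustration rather than the authors' code: the function names
are hypothetical, the RR values are made up for the example, and the
time mapping (how many pitch pulses each HRV sample governs, here the
parameter `pulses_per_value`) is an assumption, since the text does
not specify it.

```python
def modulated_periods(hrv, mean_pitch_period, c=1.0):
    """Eq. (1): delta_v = c * (delta_p / p_m) * v_m, returned as v = v_m + delta_v."""
    p_m = sum(hrv) / len(hrv)  # sample mean of the HRV signal
    return [mean_pitch_period * (1.0 + c * (p - p_m) / p_m) for p in hrv]

def pulse_positions(periods, pulses_per_value=1):
    """Sample indices of the voiced-source impulses (hypothetical time mapping)."""
    positions, t = [], 0.0
    for v in periods:
        for _ in range(pulses_per_value):
            positions.append(round(t))
            t += v
    return positions

# Illustrative RR intervals in seconds (not measured data).
rr = [0.80, 1.00, 1.20]
# Mean pitch period of 100 samples (100 Hz at 10 kHz) and scale factor c = 0.5.
periods = modulated_periods(rr, mean_pitch_period=100.0, c=0.5)
positions = pulse_positions(periods)
```

Because the deviations $\Delta p$ average to zero, the mean of the
modulated periods stays at $v_m$; the scale factor $c$ only controls
how strongly the HRV fluctuations move the pitch around that mean.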
III. RESULTS

We have proposed a vocalization model and applied it to vocalize HRV
data obtained from a group of normal subjects under various
experimental settings. The vocalization model was included in an HRV
evaluation system [12], where comparative perceptual assessment of HRV
was obtained by selectively listening to HRV signals in various
windows. After some practice, we could discern a few auditory patterns
in vocalized short-term HRV signals related to postural changes and
drug effects on the parasympathetic-sympathetic balance of the ANS.

IV.  DISCUSSION 

The vocalization model was found to be quite flexible and could be
applied generally to various types of signals. However, at present,
its main limitation is its practical utility. For example, although
the vocalized HRV offered many auditory cues for both short-term and
long-term HRV recordings, to make use of them one would need to
establish perceptual categories, like the scales used in pathological
voice assessment [1], and train physicians accordingly.

We discussed here a simple coding scheme, that of the single-source VM
based on the modulation of a single parameter. Other coding schemes
could yield quite different perceptual information, whereas the
multi-parameter coding of a multi-source VM could create a
high-dimensional auditory space.

An advantage we observed in the application to long-term HRV analysis
was that long-term pitch variation as well as local alterations of the
pitch could be perceived concurrently. In contrast, global on-screen
visualization of such data suffered from spatial aliasing (due to
limits on monitor resolution), whereas in local visual scanning we
needed to refer to values on the ordinate axis to follow long-term
variations.

The combined visual and auditory perception of the same or related
data could provide more natural access to such information, probably
with a synergistic effect. For example, scanning the HRV data
concurrently, both visually (HRV signal) and audibly (vocalized
signal), could provide a different mental picture of HRV, which could
facilitate the recognition of patterns.

The results presented are preliminary. The full extent of the HRV
application needs to be evaluated further, and the vocalization model
adjusted accordingly.

In conclusion, the vocalization model, as demonstrated by its
application to heart rate variability, provided an additional tool for
data analysis. Using it, we were able to perceive auditory patterns in
vocalized heart rate variability signals caused by shifts in the
parasympathetic-sympathetic balance of the autonomic nervous system.